Semantic Context Detection Using Audio Event Fusion

Authors

  • Wei-Ta Chu
  • Wen-Huang Cheng
  • Ja-Ling Wu

Abstract

Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event modeling and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this work, hidden Markov models (HMMs) are used to model four representative audio events in action movies: gunshot, explosion, engine, and car braking. At the semantic context level, generative (ergodic hidden Markov model) and discriminative (support vector machine (SVM)) approaches are investigated to fuse the characteristics and correlations among audio events, which provide cues for detecting gunplay and car-chasing scenes. The experimental results demonstrate the effectiveness of the proposed approaches and provide a preliminary framework for information mining by using audio characteristics.
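The generative fusion step described in the abstract can be illustrated with a minimal sketch. It assumes the audio event level has already produced one hard event label per segment, which is a simplification and not necessarily how the paper represents event evidence. It then scores the label sequence under two small ergodic HMMs and picks the more likely semantic context. Every probability value, the two-state model size, and the names `CONTEXT_MODELS`, `forward_log_likelihood`, and `detect_context` are hypothetical choices made for illustration, not parameters or identifiers taken from the paper.

```python
import math

# Index of each audio event symbol (the four events named in the abstract).
EVENTS = {"gunshot": 0, "explosion": 1, "engine": 2, "car_braking": 3}

# Hypothetical two-state ergodic HMMs (every transition has nonzero probability).
# All numbers below are made up for illustration; they are not from the paper.
CONTEXT_MODELS = {
    "gunplay": {
        "start": [0.5, 0.5],
        "trans": [[0.7, 0.3], [0.4, 0.6]],
        "emit": [[0.70, 0.20, 0.05, 0.05], [0.30, 0.50, 0.10, 0.10]],
    },
    "car_chasing": {
        "start": [0.5, 0.5],
        "trans": [[0.6, 0.4], [0.5, 0.5]],
        "emit": [[0.05, 0.05, 0.70, 0.20], [0.10, 0.10, 0.40, 0.40]],
    },
}


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm."""
    n = len(start)
    # Initialisation: alpha_0(s) = start(s) * emit(s, obs_0)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Recursion: propagate through transitions, then weight by emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        # Rescale to avoid underflow; the log of the scales sums to the log-likelihood.
        scale = sum(alpha)
        log_like += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_like


def detect_context(event_labels):
    """Pick the semantic context whose HMM best explains the event sequence."""
    obs = [EVENTS[label] for label in event_labels]
    scores = {
        name: forward_log_likelihood(obs, m["start"], m["trans"], m["emit"])
        for name, m in CONTEXT_MODELS.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    print(detect_context(["gunshot", "gunshot", "explosion", "gunshot"]))
    print(detect_context(["engine", "car_braking", "engine", "engine"]))
```

In a discriminative variant, the same per-segment event evidence would instead be fed as a feature vector to a classifier such as an SVM; that variant is not shown here.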

Similar resources

Semantic Context Detection Using Audio Event Fusion: Camera-Ready Version

Semantic-level content analysis is a crucial issue in achieving efficient content retrieval and management. We propose a hierarchical approach that models audio events over a time series in order to accomplish semantic context detection. Two levels of modeling, audio event and semantic context modeling, are devised to bridge the gap between physical audio features and semantic concepts. In this w...

Event Detection in Basketball Video Using Multiple Modalities

Semantic sports video analysis has attracted more and more attention recently. In this paper, we present a basketball event detection method that uses multiple modalities. Instead of using low-level features, the proposed method is built upon visual and auditory mid-level features, i.e., semantic shot classes and audio keywords. Promising event detection results have been achieved. By heuristically...

IRIT @ TRECVid 2010 : Hidden Markov Models for Context-aware Late Fusion of Multiple Audio Classifiers

This notebook paper describes the four runs submitted by IRIT at TRECVid 2010 Semantic Indexing task. The four submitted runs can be described and compared as follows:
  • Run 4: late fusion (weighted sum) of multiple audio-only classifiers output
  • Run 3: context-aware re-rank of run 4 using hidden Markov model
  • Run 2: context-aware late fusion of multiple audio classifiers output with hidde...

Audio-concept features and hidden Markov models for multimedia event detection

Multimedia event detection (MED) on user-generated content is the task of finding an event, e.g., a Flash mob or Attempting a bike trick, using its content characteristics. Recent research has focused on approaches that use semantically defined “concepts” trained with annotated audio clips. Using audio concepts allows us to show semantic evidence of their relationship to events, by looking at t...

Feature-Level Decision Fusion for Audio-Visual Word Prominence Detection

Common fusion techniques in audio-visual speech processing operate on the modality level. That is, they either combine the features extracted from the two modalities directly or derive a decision for each modality separately and then combine the modalities on the decision level. We investigate the audio-visual processing of linguistic prosody, more precisely the extraction of word prominence. In th...

Journal:
  • EURASIP J. Adv. Sig. Proc.

Volume 2006

Publication date: 2006